    Asynchronous Execution of Python Code on Task Based Runtime Systems

    Despite advancements in parallel and distributed computing, the complexity of programming High Performance Computing (HPC) resources has deterred many domain experts, especially in machine learning and artificial intelligence (AI), from exploiting the performance benefits of such systems. Researchers and scientists favor high-productivity languages to avoid the inconvenience of programming in low-level languages and the cost of acquiring the skills required at that level. In recent years, Python, with the support of linear algebra libraries like NumPy, has gained popularity despite limitations that prevent such code from running in distributed settings. Here we present a solution that maintains both high-level programming abstractions and parallel and distributed efficiency. Phylanx is an asynchronous array processing toolkit that transforms Python and NumPy operations into code that can be executed in parallel on HPC resources by mapping Python and NumPy functions and variables into a dependency tree executed by HPX, a general-purpose, parallel, task-based runtime system written in C++. Phylanx additionally provides introspection and visualization capabilities for debugging and performance analysis. We have tested the foundations of our approach by comparing our implementation of widely used machine learning algorithms to accepted NumPy standards.
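
    As a minimal sketch of the dependency-tree execution model described above: each array operation becomes a task whose inputs are futures, so independent operations can run concurrently and a consumer runs only once its inputs are ready. Plain std::async stands in for HPX here (HPX deliberately mirrors the standard C++ future interface), and the vector-based Array type is a simplified placeholder for NumPy arrays, not Phylanx's actual internals.

    ```cpp
    #include <cstddef>
    #include <future>
    #include <iostream>
    #include <vector>

    using Array = std::vector<double>;

    // Element-wise helpers standing in for NumPy-style array primitives.
    Array add(Array const& x, Array const& y) {
        Array r(x.size());
        for (std::size_t i = 0; i < x.size(); ++i) r[i] = x[i] + y[i];
        return r;
    }

    Array scale(Array const& x, double s) {
        Array r(x.size());
        for (std::size_t i = 0; i < x.size(); ++i) r[i] = x[i] * s;
        return r;
    }

    int main() {
        Array a{1, 2, 3}, b{4, 5, 6};

        // Two independent leaves of the dependency tree: they may run in parallel.
        auto fa = std::async(std::launch::async, scale, std::cref(a), 2.0);
        auto fb = std::async(std::launch::async, scale, std::cref(b), 3.0);

        // The root node waits on both children before combining their results.
        Array d = add(fa.get(), fb.get());
        for (double v : d) std::cout << v << ' ';  // prints: 14 19 24
        std::cout << '\n';
    }
    ```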

    Simulating Stellar Merger using HPX/Kokkos on A64FX on Supercomputer Fugaku

    The increasing availability of machines relying on non-GPU architectures in high-performance computing, such as the ARM A64FX, presents a set of interesting challenges to application developers. In addition to requiring code portability across different parallelization schemes, programs targeting these architectures have to be highly adaptable in terms of compute-kernel sizes to accommodate the different execution characteristics of various heterogeneous workloads. In this paper, we demonstrate an approach to code and performance portability that is based entirely on established industry standards. In addition to applying Kokkos as an abstraction over the execution of compute kernels in different heterogeneous execution environments, we show that the use of standard C++ constructs as exposed by the HPX runtime system enables superb portability in terms of both code and performance, based on the real-world Octo-Tiger astrophysics application. We report our experience porting Octo-Tiger to the ARM A64FX architecture provided by Stony Brook's Ookami and RIKEN's Supercomputer Fugaku, and compare the resulting performance with that achieved on well-established GPU-oriented HPC machines such as ORNL's Summit, NERSC's Perlmutter, and CSCS's Piz Daint. Octo-Tiger scaled well on Supercomputer Fugaku without any major code changes thanks to the abstraction levels provided by HPX and Kokkos, and adding vectorization support for ARM's SVE to Octo-Tiger was trivial thanks to the use of standard C++.
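
    The "vectorization through standard C++" point can be illustrated with std::experimental::simd (Parallelism TS v2, shipped with GCC 11 and later): the same kernel source compiles to SVE vectors on A64FX and to AVX or NEON elsewhere. The axpy kernel below is an illustrative sketch, not code taken from Octo-Tiger.

    ```cpp
    #include <cstddef>
    #include <experimental/simd>
    #include <vector>

    namespace stdx = std::experimental;
    using simd_t = stdx::native_simd<double>;  // register width chosen per target ISA

    // y[i] += a * x[i], one SIMD register at a time; the same source compiles
    // to SVE on A64FX and to AVX/NEON on other architectures.
    void axpy(double a, std::vector<double> const& x, std::vector<double>& y) {
        std::size_t const w = simd_t::size();
        std::size_t i = 0;
        for (; i + w <= x.size(); i += w) {
            simd_t xs(&x[i], stdx::element_aligned);   // vector load
            simd_t ys(&y[i], stdx::element_aligned);
            ys += simd_t(a) * xs;                      // broadcast a, multiply-add
            ys.copy_to(&y[i], stdx::element_aligned);  // vector store
        }
        for (; i < x.size(); ++i) y[i] += a * x[i];    // scalar remainder loop
    }
    ```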

    From Piz Daint to the Stars: Simulation of Stellar Mergers using High-Level Abstractions

    We study the simulation of stellar mergers, which requires complex simulations with high computational demands. We have developed Octo-Tiger, a finite-volume, grid-based hydrodynamics simulation code with Adaptive Mesh Refinement that is unique in conserving both linear and angular momentum to machine precision. To face the challenge of increasingly complex, diverse, and heterogeneous HPC systems, Octo-Tiger relies on high-level programming abstractions. We use HPX with its futurization capabilities to ensure scalability both between and within nodes, and present first results from replacing MPI with libfabric, achieving up to a 2.8x speedup. We extend Octo-Tiger to heterogeneous GPU-accelerated supercomputers, demonstrating node-level performance and portability. We show scalability up to full-system runs on Piz Daint. For the scenario's maximum resolution, the compute-critical parts (hydrodynamics and gravity) achieve 68.1% parallel efficiency at 2048 nodes.
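
    For reference, the strong-scaling parallel efficiency quoted above is conventionally computed relative to a baseline run, E(n) = (n0 * t0) / (n * tn); the sketch below uses hypothetical node counts and timings, not measurements from the paper.

    ```cpp
    #include <iostream>

    // Strong-scaling parallel efficiency relative to a baseline run on n0 nodes:
    // E(n) = (n0 * t0) / (n * tn). E == 1.0 would be perfect scaling.
    double parallel_efficiency(int n0, double t0, int n, double tn) {
        return (static_cast<double>(n0) * t0) / (static_cast<double>(n) * tn);
    }

    int main() {
        // Hypothetical numbers (not from the paper): a 128-node baseline taking
        // 100 s and a 2048-node run taking 9.18 s give ~0.68, i.e. ~68% efficiency.
        std::cout << parallel_efficiency(128, 100.0, 2048, 9.18) << '\n';
    }
    ```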

    Methylation of H3-Lysine 79 Is Mediated by a New Family of HMTases without a SET Domain

    The N-terminal tails of core histones are subjected to multiple covalent modifications, including acetylation, methylation, and phosphorylation [1]. Similar to acetylation, histone methylation has emerged as an important player in regulating chromatin dynamics and gene activity [2–4]. Histone methylation occurs on arginine and lysine residues and is catalyzed by two families of proteins, the protein arginine methyltransferase family and the SET-domain-containing methyltransferase family [3]. Here, we report that lysine 79 (K79) of H3, located in the globular domain, can be methylated. K79 methylation occurs in a variety of organisms ranging from yeast to human. In budding yeast, K79 methylation is mediated by the silencing protein DOT1. Consistent with the conservation of K79 methylation, DOT1 homologs can be found in a variety of eukaryotic organisms. We identified a human DOT1-like (DOT1L) protein and demonstrated that this protein possesses intrinsic H3-K79-specific histone methyltransferase (HMTase) activity in vitro and in vivo. Furthermore, we found that the K79 methylation level is regulated throughout the cell cycle. Thus, our studies reveal a new methylation site and define a novel family of histone lysine methyltransferases.

    Accelerated surgery versus standard care in hip fracture (HIP ATTACK): an international, randomised, controlled trial

    Dictator Games: A Meta Study

    Scalable, Automated Performance Analysis with TAU and PerfExplorer
